-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-25877][k8s] Move all feature logic to feature classes. #23220
Conversation
This change makes the driver and executor builders a lot simpler by encapsulating almost all feature logic into the respective feature classes. The only logic that remains is the creation of the initial pod, which needs to happen before anything else so is better to be left in the builder class. Most feature classes already behave fine when the config has nothing they should handle, but a few minor tweaks had to be added. Unit tests were also updated or added to account for these. The builder suites were simplified a lot and just test the remaining pod-related code in the builders themselves.
Kubernetes integration test starting |
Test build #99687 has finished for PR 23220 at commit
|
Kubernetes integration test status success |
Test build #99800 has finished for PR 23220 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status success |
So, anybody interested in reviewing this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main code looks a lot better, thanks! I have a concern about the tests.
val spec = pod.pod.getSpec | ||
assert(!spec.getContainers.asScala.exists(_.getName == "executor-container")) | ||
assert(spec.getDnsPolicy === "dns-policy") | ||
assert(spec.getHostAliases.asScala.exists(_.getHostnames.asScala.exists(_ == "hostname"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The second call to exists
can be contains
instead, so that we don't pass a function object that ignores the argument. Alternatively, both exists
calls can be removed:
spec.getHostAliases.asScala.flatMap(_.getHostnames).contains("hostname")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not modifying this code, just moving it from its previous location.
assert(metadata.getNamespace === "namespace") | ||
assert(metadata.getOwnerReferences.asScala.exists(_.getName == "owner-reference")) | ||
val spec = pod.pod.getSpec | ||
assert(!spec.getContainers.asScala.exists(_.getName == "executor-container")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Use .asScala.map(_.getName).contains("executor-container")
. Similar changes come up throughout this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not modifying this code, just moving it from its previous location.
kubernetesClient | ||
} | ||
|
||
private def verifyPod(pod: SparkPod): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of things this test checks is remarkable, and it is very much possible to accidentally omit checking the application of a specific feature when a new one is added for either the driver or executor. This is why we had the overridable feature steps in the original incarnation of these tests. Not mocking the substeps leads us to need to check that some specific aspect of each step has been applied. Can we go back to mocking the different steps so that this test can be more easily modified when we add more features? Or else can we abstract away the idea that these steps are applied without this test itself needing to know what the step itself actually does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another factor of my concern about is that for each individual assertion, it is unclear which step the assertion is tied to. This reads a lot more like an ETE test than a unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually did not write this test. I copy & pasted it with zero modifications from the previous class, and I'd prefer to keep it that way.
That's also an argument for not restoring the mocks, which would go against what this change is doing. This test should account for modifications made by other steps, since if they modify something unexpected, that can change the semantics of the feature (pod template support).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's also an argument for not restoring the mocks, which would go against what this change is doing. This test should account for modifications made by other steps, since if they modify something unexpected, that can change the semantics of the feature (pod template support).
Wouldn't most of those unexpected changes come from the unit tests of the individual steps? Granted this test can catch when a change in one step impacts behavior in another step, which is important. Given that this isn't changing prior code I'm fine with leaving this as-is and addressing again later if it becomes a problem.
+1 from me, would like @liyinan926 to take a second look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merging to master |
This change makes the driver and executor builders a lot simpler by encapsulating almost all feature logic into the respective feature classes. The only logic that remains is the creation of the initial pod, which needs to happen before anything else so is better to be left in the builder class. Most feature classes already behave fine when the config has nothing they should handle, but a few minor tweaks had to be added. Unit tests were also updated or added to account for these. The builder suites were simplified a lot and just test the remaining pod-related code in the builders themselves. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#23220 from vanzin/SPARK-25877.
This change makes the driver and executor builders a lot simpler by encapsulating almost all feature logic into the respective feature classes. The only logic that remains is the creation of the initial pod, which needs to happen before anything else so is better to be left in the builder class. Most feature classes already behave fine when the config has nothing they should handle, but a few minor tweaks had to be added. Unit tests were also updated or added to account for these. The builder suites were simplified a lot and just test the remaining pod-related code in the builders themselves. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#23220 from vanzin/SPARK-25877.
This change makes the driver and executor builders a lot simpler
by encapsulating almost all feature logic into the respective
feature classes. The only logic that remains is the creation of
the initial pod, which needs to happen before anything else so
is better to be left in the builder class.
Most feature classes already behave fine when the config has nothing
they should handle, but a few minor tweaks had to be added. Unit
tests were also updated or added to account for these.
The builder suites were simplified a lot and just test the remaining
pod-related code in the builders themselves.